Friedrich von Henke, Ulm University [head of department]
Thorsten Liebig, Ulm University & derivo GmbH, thorsten.liebig@uni-ulm.de [PRIMARY contact]
Olaf Noppens, Ulm University & derivo GmbH, olaf.noppens@uni-ulm.de
Our approach is based on the VIScover system, which combines semantic technologies with interactive exploration and visualization techniques. The provided social network and geospatial data has been enriched to a knowledge net (aka ontology) which captures implicitly given characteristics of the domain as ontological axioms. For instance, some of the relationships are symmetric, such as the flitter-contact-with relationship amongst persons or the nearby relationship amongst cities. These axioms are the source of an inference process which makes hidden knowledge explicitly available. We use W3C’s Web Ontology Language (OWL) [1] as the knowledge representation language and RacerPro [2] as the underlying reasoning system.
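The effect of a symmetry axiom can be illustrated with a minimal Python sketch. This is not how RacerPro performs inference; it only shows, on hypothetical toy data, how links asserted in one direction become available in both directions. The function name and the user names are assumptions made for illustration.

```python
from collections import defaultdict

def symmetric_closure(edges):
    """Return an adjacency map in which every asserted link also holds in reverse."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)  # the reverse link follows from the symmetry axiom
    return dict(adjacency)

# Hypothetical toy data, not taken from the challenge data set.
asserted = [("u1", "u2"), ("u2", "u3")]
contacts = symmetric_closure(asserted)
print(sorted(contacts["u2"]))
```

Here u3 is never asserted as the source of a link, yet after the closure it has u2 as a contact.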
On top of this infrastructure we use our exploration and visualization system VIScover [3], [4], which displays significant correlations within a knowledge net in a clear and concise manner using modern visualization and analysis techniques. It visualizes hidden information dependencies, abstracts from detail when needed, and automatically groups data by its meaning. Even long chains of correlations and causalities can easily be tracked. With respect to the social network data, VIScover allows the user to incrementally explore branches of the flitter connectivity network in parallel. Relationships are graphically depicted as clubs leading from the set of source entities to the relationship’s fillers. The "club visualization" metaphor of VIScover also displays quantities of relatedness and allows the user to qualify clubs on demand with the help of logical as well as numerical filters.
[1] OWL
[2] http://www.racer-systems.com/
[3] I-Know 2008, Understanding Interlinked Data - Visualising, Exploring, and Analysing Ontologies
[4] AVI 2008, Realizing the Hidden - Interactive Visualization and Analysis of Large Volumes of Structured Data
Video:
Accompanying video to answers of MC2.1, MC2.3, MC2.4, and MC2.5.
ANSWERS:
MC2.1: Which of the two social structures, A or B, most closely match the scenario you have identified in the data?
A
MC2.2: Provide the social network structure you have identified as a tab delimitated file. It should contain the employee, one or more handler, any middle folks, and the localized leader with their international contacts. What are the Flitter names of the persons involved? Please identify only key connections (not all single links for example) as well as any other nodes related to the scenario (if any) you may have discovered that were not described in the two scenarios A and B above. Please name the file Flitter.txt and place it in the same directory as your index.htm file. Please see the format required in the Task Descriptions.
Please link (relative!) to the file here.
MC2.3: Characterize the difference between your social network and the closest social structure you selected (A or B). If you include extra nodes please explain how they fit in to your scenario or analysis.
We have identified a social network of type A. This network is shown in Figure 1.
From left to right it first shows the employees under suspicion (depicted as circles within a disc labeled "Suspected-Employee"). They qualify as suspects since they match the characterization of "having between 37 and 43 flitter contacts and at least three contacts who themselves have 30 to 40 flitter contacts". This set is computed (inferred) by the underlying reasoner as a logical consequence of the formal ontological axioms defined within the concept Suspected-Employee. Figure 2 shows this definition (light-blue background) as rendered with our tool, together with all other concept definitions created to solve this mini challenge.
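The characterization of a suspected employee can be restated as a plain degree filter. The following Python sketch is only an illustration of that condition; in our system the set is inferred by the reasoner from the concept definition, not computed by code like this. The function names and the toy network are hypothetical.

```python
def link(contacts, a, b):
    """Add a symmetric flitter contact between a and b."""
    contacts.setdefault(a, set()).add(b)
    contacts.setdefault(b, set()).add(a)

def suspected_employees(contacts, own=(37, 43), handler=(30, 40), min_handlers=3):
    """People with 37-43 contacts, at least three of whom have 30-40 contacts."""
    suspects = set()
    for person, friends in contacts.items():
        if not own[0] <= len(friends) <= own[1]:
            continue
        handler_like = [
            f for f in friends
            if handler[0] <= len(contacts.get(f, ())) <= handler[1]
        ]
        if len(handler_like) >= min_handlers:
            suspects.add(person)
    return suspects

# Hypothetical toy network: "e" has 37 contacts, three of which have 30 contacts each.
toy = {}
for i in range(37):
    link(toy, "e", f"c{i}")
for h in ("c0", "c1", "c2"):
    for j in range(29):
        link(toy, h, f"{h}_x{j}")

suspects = suspected_employees(toy)
print(suspects)
```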
Figure 3 shows the graph after expanding the set of suspected employees with respect to their flitter contacts but before any further filter restrictions (i.e. before applying the Min30max40-contacts filter).
Since the handlers are flitter contacts of the wanted employee, they must be within the club expansion of the 401 contacts (rightmost disc of Figure 3). We furthermore know that these handlers have between 30 and 40 contacts themselves. Therefore, we can narrow the initial expansion by using the concept description Min30max40-contacts as a filter (via drag-and-drop). As a result of this filter, the club then consists of 33 potential handlers. In order to identify the middleman (as well as the corresponding handlers) we have to look at the handlers' flitter contacts, which we restrict to those having 3 to 5 further contacts (the three handlers and one or two others, including Fearless Leader). We also know that Boris maintains contact with Fearless Leader, who himself has well over 100 contacts, a restriction we can likewise define and apply as a constraining axiom (Contact-w-min100-contacts). Furthermore, we know that the middleman Boris (an individual in the third disc) has contact with at least three individuals of the preceding club. This quantity (the number of related individuals in the preceding club) is shown in VIScover as a number within the individual's circle. It turns out that only one individual fulfills this condition, namely good. We tag this individual as the likely middleman (colored red in the remainder).
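The middleman conditions described above can be summarized in a short Python sketch. It is a hedged illustration on hypothetical toy data, not the VIScover filter mechanism itself; the function names, the parameter names, and the reading of "well over 100 contacts" as "more than 100" are assumptions.

```python
def link(contacts, a, b):
    """Add a symmetric flitter contact between a and b."""
    contacts.setdefault(a, set()).add(b)
    contacts.setdefault(b, set()).add(a)

def middleman_candidates(contacts, handlers, degree_range=(3, 5),
                         min_handler_links=3, leader_min=100):
    """Map each candidate to the number of potential handlers it is linked to."""
    low, high = degree_range
    candidates = {}
    for person, friends in contacts.items():
        if person in handlers or not low <= len(friends) <= high:
            continue
        linked = friends & handlers  # contacts within the preceding club
        knows_leader = any(len(contacts.get(f, ())) > leader_min for f in friends)
        if len(linked) >= min_handler_links and knows_leader:
            candidates[person] = len(linked)
    return candidates

# Hypothetical toy network: "m" knows three handlers, "leader", and one other person.
toy = {}
for h in ("h1", "h2", "h3"):
    link(toy, "m", h)
link(toy, "m", "leader")
link(toy, "m", "other")
for i in range(101):
    link(toy, "leader", f"follower{i}")

result = middleman_candidates(toy, {"h1", "h2", "h3"})
print(result)
```

The returned count corresponds to the number VIScover displays inside the individual's circle.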
To see whether good really is the middleman, we need to check that his handlers (within the preceding club) do not communicate among themselves. This can be done with the help of a non-connectivity partitioning with respect to the flitter relationship within VIScover. Figure 4 shows all of these partitions of the handlers club. One can see that there is just one partition of three individuals (reitenspies, pettersson, kushnir), all of whom are in contact with exactly one of the suspected employees, namely schaffter. These three are the handlers.
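The two checks in this step (no contact among the handlers, and a shared suspected employee) can be sketched in Python as follows. This is not the partitioning algorithm used by VIScover, only a direct test of the two conditions on hypothetical toy data with made-up names.

```python
from itertools import combinations

def mutually_unconnected(contacts, group):
    """True if no two members of the group are flitter contacts of each other."""
    return all(b not in contacts.get(a, set()) for a, b in combinations(group, 2))

def shared_contacts(contacts, group, pool):
    """Members of pool that are in contact with every member of the group."""
    common = set(pool)
    for member in group:
        common &= contacts.get(member, set())
    return common

# Hypothetical toy network: three handlers share employee "e" and middleman "m".
toy = {
    "h1": {"e", "m"},
    "h2": {"e", "m"},
    "h3": {"e", "m"},
    "e": {"h1", "h2", "h3"},
    "e2": set(),
    "m": {"h1", "h2", "h3"},
}
handlers = ["h1", "h2", "h3"]
isolated = mutually_unconnected(toy, handlers)
employees = shared_contacts(toy, handlers, {"e", "e2"})
print(isolated, employees)
```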
Now, in order to spot Fearless Leader, we have to expand the contacts of good (aka Boris). Fearless Leader must be one of his five contacts. Since we know that Fearless Leader has more than 100 contacts, we can further restrict the set of candidates with the help of this filter (see concept Min100-contacts). In the resulting set only one individual remains, namely szemeredi (see Figure 1), the Fearless Leader (who himself maintains 256 flitter contacts).
Scenario B does not apply to the provided data. Figure 5 shows an expansion of the network similar to the one above but adapted to case B, where the three middlemen are restricted to having contact with 3 to 4 people (a handler, Fearless Leader, and one or two others). These middlemen share one common contact (Fearless Leader) who has more than 100 contacts of his own. However, as one can see in Figure 5, there is no such common contact: there is just one person, irvin, who is related to only one potential middleman (rowan). Therefore, there is no evidence for case B (even though there is a chain of flitter contacts to schaffter, the employee of case A).
MC2.4: How is your hypothesis about the social structure in Part 1 supported by the city locations of Flovania? What part(s), if any, did the role of geographical information play in the social network of part one?
As shown in Figure 1 the employee and the handlers are from Prounov, the middleman from Kannvic, and Fearless Leader from Kouvnic -- all located in Flovania.
MC2.5: In general, how are the Flitter users dispersed throughout the cities of this challenge? Which of the surrounding countries may have ties to this criminal operation? Why might some be of more significant concern than others?
The overall distribution of persons to cities is depicted in Figure 1. Most people, exactly 1998, live in Koul.
With respect to the contacts of Fearless Leader (szemeredi), we see that he maintains contact with persons in almost all cities and countries. Figure 2 shows the distribution of his 256 contacts across the 12 cities as well as the countries. It shows that he has contact with 99 persons in Koul, followed by 63 persons in Prounov and 32 persons in Kouvnic. The first city not in Flovania is Otello with 7 contacts.
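A distribution of this kind amounts to counting a person's contacts per home city. The Python sketch below illustrates the counting on hypothetical toy data; the contact names and the numbers in it are made up and do not reproduce the figures reported above.

```python
from collections import Counter

def contacts_by_city(contacts, home_city, person):
    """Count the contacts of a person grouped by the contacts' home city."""
    return Counter(home_city[f] for f in contacts.get(person, ()))

# Hypothetical toy data, not the challenge data set.
toy_contacts = {"leader": {"a", "b", "c"}}
toy_city = {"a": "Koul", "b": "Koul", "c": "Prounov"}

distribution = contacts_by_city(toy_contacts, toy_city, "leader")
for city, count in distribution.most_common():
    print(city, count)
```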